ABSTRACT


An ensemble consisting of 150 Ziphius cavirostris vocalizations was compiled
from acoustic data recorded at two High-frequency Acoustic Recording Package (HARP)
locations: the Naval Postgraduate School (NPS)’s Point Sur HARP and Scripps
Institution of Oceanography (SIO)’s Site H HARP. The ensemble was analyzed via a
principal component analysis (PCA). The results of the PCA verified the statistical
robustness of the signal and yielded one dominant mode which accounted for 73% of the
variance. The dominant mode was used to create a kernel for a matched filter detection
scheme. The subsequent detector output was statistically evaluated against a ground
truth. The ground truth, created by visually inspecting over 170 minutes of data recorded
by NPS’s Data Acquisition System (DAS) at the Southern California Offshore Range
(SCORE), identified 28,434 Ziphius clicks. The inability to visually discriminate a signal
embedded in noise created a conservatively biased ground truth estimate which increased
the detector’s false alarm rate. At an acceptable 0.1% false alarm rate, the detector had
an overall 44% probability of detection. A further assessment of the detector’s
performance divided the data into two categories: cluttered and uncluttered. At a false
alarm rate of 0.1%, the probability of detection was 26% for cluttered data and 61% for
uncluttered data.

I. INTRODUCTION


A. BACKGROUND

In an ongoing legal dispute with the Natural Resources Defense Council
(NRDC), the U.S. District Court for the Central District of California has imposed
restrictions which affect the Navy’s ability to operate Mid-Frequency
Active (MFA) sonar. MFA sonar has been operated for over 60 years and is the primary
method to localize submarines (Hastings, 2008). These legal restrictions impede
combat proficiency in, and the advancement of, the U.S. Navy Pacific Fleet’s top priority:
anti-submarine warfare (ASW). On January 23, 2007, under Title 16, Section 1371(f) of the
U.S. Code, the Deputy Secretary of Defense invoked a two-year National Defense
Exemption (NDE) under the Marine Mammal Protection Act (MMPA) which includes 29
mitigation measures. These 29 mitigation measures were developed along with the
National Marine Fisheries Service (NMFS) to reduce the potential impacts of MFA sonar
on marine mammals through increased aerial monitoring and visual surveying (Federal
Register, 2008).

Recent mass stranding incidents involving beaked whales, both temporally and
geographically coincident with naval emissions of underwater sound, coupled with these
high-profile legal ramifications, have increased the need for more effective methods of
detection and classification. Cuvier’s beaked whales (Ziphius cavirostris) are among
those of greatest concern with respect to curtailing the potential effects of
anthropogenic sound (Zimmer et al., 2005; Cox et al., 2006). Since 1960, more than 40
mass strandings of Cuvier’s beaked whales have been reported worldwide (Cox et al.,
2006). This species alone comprises over 80 percent of all marine mammals involved in
stranding incidents (Hildebrand, 2005). Further amplifying the issue, research on and
knowledge of this species are severely limited. Cuvier’s beaked whales are difficult to
study and identify via traditional visual surveying techniques because of their
lengthy deep-diving behavior, typically spending up to 40 minutes beneath the
surface of the water in a single dive (Barlow et al., 2006). Cuvier’s beaked whales
spend less than 3 minutes at the surface between dives, leaving little time for visual
identification (Barlow, 1999). This species has also been observed to surface without any
visible blow or splash (Ferguson et al., 2006). In an experiment utilizing acoustic
recording tags (DTAGs) attached to a Cuvier’s beaked whale, the average depth recorded
during a deep diving period was approximately 850 m, with vocalizations ceasing when
the whale was within 200 meters of the surface (Johnson et al., 2004; Tyack et al., 2006).
The accuracy of visual identification is further limited by many additional factors,
including sea state, visibility, daylight, and the individual observer’s experience level
and biases. The development of an automated passive acoustic detector would provide
the U.S. Navy with the capacity to observe this species’ presence and movement under
conditions not appropriate for visual surveys. Furthermore, passive acoustic techniques
are more cost-effective, require less underway time, allow for continuous monitoring, and
could provide information on seasonal and diurnal population patterns.

B. THESIS OBJECTIVES

There are two primary objectives for this thesis. The first objective is to develop
a kernel for the vocalizations of Cuvier’s beaked whales. This will be achieved by
conducting a principal component analysis (PCA) upon an ensemble of extracted Ziphius
clicks. The kernel will then be used in an automated passive acoustic matched filter
detection scheme.

The second objective of this thesis is to assess the performance of the automated
passive acoustic detector. This will be achieved by first creating a ground truth count of
Ziphius vocalizations. Then, Receiver Operating Characteristic (ROC) curves portraying
the detector’s performance will be constructed via a statistical comparison of the ground
truth to the detector output.

C. OUTLINE

The remainder of this thesis consists of three chapters. Chapter II describes the
methods used to achieve the two primary objectives. To create the kernel, a principal
component analysis was conducted upon an ensemble composed of 150 randomly
selected Cuvier’s beaked whale vocalizations. To assess the detector’s performance, a
ground truth was created by visually reviewing 174.8 minutes of the Naval Postgraduate
School (NPS)’s Data Acquisition System (DAS) recordings for the Southern California
Offshore Range (SCORE) and identifying 28,434 occurrences of a Ziphius click. Chapter
III contains the ROC curves and a discussion of the automated passive acoustic detector’s
performance relative to the ground truth. Chapter IV presents the conclusions of this
thesis.

II. METHODOLOGY


A. KERNEL DEVELOPMENT

1. Signal Characterization

The kernel development process began with a characterization of the Ziphius signal
derived from recent research. Although some odontocetes (toothed whales)
produce both clicks and whistles during vocalization, Cuvier’s beaked whales are known only
to click (Hildebrand, 2005). Recent research suggests that the clicks of Cuvier’s beaked
whales exhibit a unique spectral and temporal structure that differs significantly from the
recordings of other non-ziphid toothed whales. A unique signal is favorable for
automated acoustic monitoring.

In September of 2003, research conducted by attaching a digital acoustic
recording tag (DTAG) directly to a whale reported a click duration of 175 µs with an
interclick interval (ICI) of 0.4 seconds. The spectrum swept upwards from 30 to 48 kHz
(Johnson et al., 2004). One year later, the NATO Undersea Research Center (NURC) and
Woods Hole Oceanographic Institution (WHOI) collaborated in a concentrated attempt to
build upon the sparse knowledge of Cuvier’s beaked whales. Their research found a
click duration of 200 µs and an average ICI of 0.4 seconds. The spectrum was frequency
modulated (FM) and swept upwards from 35 to 45 kHz (Zimmer et al., 2005). However,
both of these studies used acoustic recording devices with a cutoff
frequency of 48 kHz; hence, they provide no information on the upper frequency
limit of click energy. Consequently, the reported click durations are shortened and the
bandwidths are narrowed.

On September 26, 2005, further research by NURC, utilizing a towed array,
reinforced and expanded upon the DTAG Ziphius signal characterization. This recording
method was able to capture the entire bandwidth of the signal. A Passive Acoustic
Monitoring (PAM) system with a bandwidth of 96 kHz was activated after a visual
sighting of two Cuvier’s beaked whales initiating a deep dive. The PAM recordings
depicted an upswept energy range of 16 to 60 kHz with a center frequency of 40 kHz.
The click duration was approximately 300 µs with an average ICI of 0.38 s (Pavan et al.,
2006). These results mark the first time a sub-surface detection device was able to verify
characteristic features of the DTAG recordings. The increase in the signal’s duration and
bandwidth is explained by the increased bandwidth of the recording method.

Additional DTAG research indicates significant differences in signal
characteristics between Cuvier’s beaked whales and other toothed whales. The Ziphius
signal was characterized by an upswept FM pulse, an average click duration of 200 to
300 µs, and an ICI of 0.4 s (Tyack et al., 2006). Overall, recent research indicates a
unique signal structure that is favorable for automated acoustic detection.

2. Ensemble Creation

Designed specifically to monitor marine mammals, the High-frequency Acoustic
Recording Package (HARP) was developed by the Scripps Institution of Oceanography
(SIO). The HARP, which is capable of a 200 kHz sampling rate and nearly 2 TB of data
storage per instrument deployment, is ideal for recording the higher frequency Ziphius
clicks over long periods of time. For recordings made at a sampling rate of 200 kHz, 55
days of continuous recording are possible (Wiggins and Hildebrand, 2007). To study the
signals of Cuvier’s beaked whales, long-term, broadband, underwater acoustic data
recorded via HARPs were obtained from two different locations with known Ziphius
activity: Point Sur and the San Nicolas Basin. The NPS Point Sur HARP is moored at
36° 17.95' N, 122° 23.63' W, approximately 40 km off the central coast of California at a
water depth of 1390 meters. Acoustic data recorded during the NPS Point Sur HARP’s
second deployment, which spanned from 24JAN07 until 17JUL07, comprised one half of
the data that was used in the ensemble creation. SIO provided data, spanning from
22AUG07 to 24AUG07, from their Site H HARP, located just east of the San Nicolas
Basin at a depth of 1013 meters.

To evaluate the statistical robustness of the signal, two ensembles were randomly
extracted from the HARP data sets: one a compilation of Ziphius clicks, the second a
compilation of ambient noise segments. The click ensemble is used to generate a
kernel which contains the statistically dominant characteristics of the signal. The noise
ensemble is put through the same statistical analyses as the click ensemble to ensure
that the ambient noise is not correlated. Triton software, courtesy of Wiggins (personal
communication), was used to visually inspect the data and extract 150 random
vocalizations, following the qualitative characterizations of the Ziphius click from
previous research. The total ensemble of clicks consisted of 75 samples from each
HARP location. An example of one click extraction from each location is shown in
Figure 1. The 100-sample ensemble of random ambient noise segments consisted of 50
noise segments from each HARP dataset.


Figure 1. Two examples of Ziphius clicks extracted from HARP data: a) Spectrogram
of a click extracted from NPS’s Point Sur HARP with click energy upsweeping
from 35 to 50 kHz. b) Time series of a click corresponding to the spectrogram
above it. c) Spectrogram of a click extracted from SIO’s Site H HARP with click
energy upsweeping from 35 to 50 kHz. d) Time series of a click corresponding
to the spectrogram above it.

Although both ensembles were created solely from HARP data, the subsequent
analyses had to account for the bandwidth and sampling frequency differences between a
HARP and a SCORE hydrophone in order to produce results that would be applicable to
data from either acoustic recording method. A SCORE hydrophone has a band-pass filter
installed which limits the frequency response to a bandwidth of 8-40 kHz. At the time of
the data collection, the sampling rate of the SCORE hydrophone was set at 80 kHz,
whereas the HARP was set at 200 kHz. To ensure consistency between recording
methods, the ensembles were processed into two distinct sub-sets.

To make the ensembles applicable to a SCORE hydrophone, the first step
was to decrease the sampling rate from 200 kHz to 80 kHz. To ensure the bandwidth was
consistent with a SCORE hydrophone, a band-pass filter with a pass band of 15-40 kHz
was applied to the ensemble. Fifteen kHz was used as the lower band edge to
eliminate noise that existed at frequencies lower than the Ziphius signal. The sampling
rate of the ensembles was then increased to 1 MHz to increase
the resolution and decrease the potential for correlation quantization errors. For clarity,
this first sub-set will be referred to as the ensembles that were band-pass filtered between
15-40 kHz.

The ensembles were also processed for applicability to HARP data in a second
sub-set. Both ensembles were band-pass filtered between 15-60 kHz to eliminate noise at
frequencies lower than the Ziphius signal. The upper band edge of 60 kHz allows for the
inclusion of more high-frequency click energy. As in the first sub-set, the sampling
frequency of the click and noise segment ensembles was increased to 1 MHz.
This second sub-set will be referred to as the ensembles that were band-pass filtered
between 15-60 kHz.
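
The sampling-rate changes in both sub-sets are simple rational ratios, which a short calculation can confirm. The following Python sketch is illustrative only: the resampling routine actually used is not stated here, and the helper name `resample_ratio` is hypothetical.

```python
from fractions import Fraction

def resample_ratio(fs_in_hz: int, fs_out_hz: int) -> tuple:
    """Return integer (up, down) factors such that fs_in * up / down == fs_out."""
    ratio = Fraction(fs_out_hz, fs_in_hz)
    return ratio.numerator, ratio.denominator

# First sub-set: HARP rate (200 kHz) down to the SCORE rate (80 kHz),
# then up to the 1 MHz rate used for the correlation analysis.
harp_to_score = resample_ratio(200_000, 80_000)        # interpolate by 2, decimate by 5
score_to_analysis = resample_ratio(80_000, 1_000_000)  # interpolate by 25, decimate by 2

# Second sub-set: HARP rate (200 kHz) directly up to 1 MHz.
harp_to_analysis = resample_ratio(200_000, 1_000_000)  # interpolate by 5

# At 80 kHz the Nyquist frequency coincides with the 40 kHz upper band edge.
nyquist_score_hz = 80_000 / 2

# Sample spacing in microseconds before and after upsampling.
dt_score_us = 1e6 / 80_000
dt_analysis_us = 1e6 / 1_000_000
```

At 80 kHz one sample spans 12.5 µs, a coarse step relative to a click lasting a few hundred microseconds; at 1 MHz the step is 1 µs, which is consistent with the stated aim of reducing correlation quantization errors.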

3. Quantitative Signal Evaluation

A correlation analysis was performed on both sub-sets of ensembles to assess the
statistical robustness of the Ziphius signal and evaluate the feasibility of developing
an automated detector. First, the 150 samples of both click ensembles were demeaned,
normalized, and aligned via circular shifting. For both sub-sets of click ensembles, the
click to click cross-correlation results indicated a statistically high level of correlation
among the samples. Figure 2 depicts the values for the 150 click to click
cross-correlations for the click ensemble that was band-pass filtered between 15-40 kHz. These
results indicate that the signal is statistically robust. A cross-correlation was also
performed on each sub-set’s ensemble of random noise segments to ensure that the noise
was not correlated. For both sub-sets’ noise segment ensembles, the noise to noise
cross-correlation results indicated a statistically low level of correlation among the
samples.
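
The preprocessing steps named above (demeaning, normalizing, and aligning via circular shifting) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the routine used in this research: unit-energy normalization and alignment to the lag of maximum circular cross-correlation are assumptions, and the function names are hypothetical.

```python
import math

def demean_normalize(x):
    """Remove the mean and scale the waveform to unit energy (assumed normalization)."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered]

def circular_xcorr(a, b):
    """Circular cross-correlation of two equal-length sequences, one value per lag."""
    n = len(a)
    return [sum(a[i] * b[(i + lag) % n] for i in range(n)) for lag in range(n)]

def align_to(reference, x):
    """Circularly shift x so that its correlation with reference peaks at lag 0.

    Returns (aligned_waveform, peak_correlation)."""
    corr = circular_xcorr(reference, x)
    best_lag = max(range(len(corr)), key=lambda k: corr[k])
    return x[best_lag:] + x[:best_lag], corr[best_lag]

# Synthetic stand-in for a click: a short windowed sinusoid (not real data).
n = 64
click = demean_normalize(
    [math.sin(2 * math.pi * 8 * i / n) * math.exp(-((i - 20) / 6) ** 2) for i in range(n)]
)
shifted = click[-10:] + click[:-10]      # same click, circularly delayed by 10 samples
aligned, peak = align_to(click, shifted)
```

With unit-energy inputs the peak value acts as a correlation coefficient, so a delayed copy of the same click realigns onto the original with a peak of 1, while unrelated noise segments would be expected to give values near 0.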


Figure 2. Click to click cross-correlation results for the click ensemble that was
band-pass filtered between 15-40 kHz: The correlation values range from 0 to 1, with a
value of 1 indicating a perfect correlation. (Plot title: Click to Click
Cross-Correlation for Ensemble BPF 15-40 kHz; axis label: Ensemble Clicks.)

4. Principal Component Analysis

Once the signal was determined to be quantitatively robust, both sub-sets of click
ensembles were analyzed via a principal component analysis (PCA) to further evaluate
the potential for a matched filter detection scheme. The goal of the PCA is to isolate the
desired signal from the noise. PCA is a statistical technique that was invented in
1901 by Karl Pearson. It is defined as an orthogonal linear transformation that
converts data into a new coordinate system such that the greatest variance of the data
comes to lie on the first coordinate, often referred to as the principal component (Shaw,
2003). The second greatest variance lies on the second coordinate, the third greatest
on the third coordinate, and so on. This method of decomposing the data
makes it possible to retain the characteristics of the signal that contribute most to its
variance by keeping the components with the highest variance and ignoring the
components with the least variance.

Mathematically, the principal component can be obtained by solving the
following eigenvalue-eigenvector equation:

$$A A^{T} v_i = \lambda_i v_i \qquad (1)$$

where $A$ is a data matrix with 150 columns, each column containing one realization of
a realigned click from the ensemble; $A A^{T}$ is the data covariance matrix; and
$\lambda_i$ is the eigenvalue, which is the variance resolved by the ith component, $v_i$.

The PCA was performed via a Matlab routine to yield the components and the
associated variances. For the click ensemble that was band-pass filtered between 15-40
kHz, the PCA’s first component contained 73% of the variance. The remaining
components all had values of less than 6% and correspond to noise, including multipath
contamination. The results of this PCA indicate that there is only one dominant
component. The results of the PCA for the first sub-set are shown in Figure 3.
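
For readers without Matlab, the dominant component of the eigenvalue-eigenvector problem and its share of the variance can be approximated with power iteration on the data covariance matrix. The Python sketch below is an assumption-laden stand-in for the Matlab routine, not a reproduction of it: the function names, the random starting vector, and the synthetic ten-column ensemble are all hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def first_principal_component(clicks, iterations=100, seed=0):
    """Dominant eigenvector of A A^T, where each aligned click is one column of A.

    Returns (component, fraction_of_total_variance)."""
    n = len(clicks[0])
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]        # arbitrary starting vector
    for _ in range(iterations):
        w = [dot(c, v) for c in clicks]                # w = A^T v
        u = [sum(w[k] * clicks[k][i] for k in range(len(clicks)))
             for i in range(n)]                        # u = A A^T v
        norm = math.sqrt(dot(u, u))
        v = [x / norm for x in u]
    w = [dot(c, v) for c in clicks]
    eigenvalue = dot(w, w)                             # v^T A A^T v
    total = sum(dot(c, c) for c in clicks)             # trace of A A^T
    return v, eigenvalue / total

# Synthetic ensemble: a common unit-norm "signal" plus an orthogonal unit-norm
# "clutter" term whose sign alternates from click to click.
n = 32
signal = [math.sin(2 * math.pi * 3 * i / n) / 4.0 for i in range(n)]
clutter = [math.cos(2 * math.pi * 5 * i / n) / 4.0 for i in range(n)]
clicks = [[s + (0.5 if k % 2 == 0 else -0.5) * c for s, c in zip(signal, clutter)]
          for k in range(10)]
component, fraction = first_principal_component(clicks)
```

In this toy ensemble the signal term carries 10 units of variance and the clutter term 2.5, so the first component recovers the signal shape (up to sign) and accounts for 80% of the variance. The sign of an eigenvector is arbitrary, so a component used as a kernel may need to be flipped to match the polarity of the clicks.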

Figure 3. Principal component analysis results for the click ensemble that was
band-pass filtered between 15-40 kHz: The dominant component is emphasized with a
red circle and contains 73% of the variance.

The second sub-set, which was band-pass filtered between 15 and 60 kHz, also
produces only one dominant component. The first component of the second sub-set’s
PCA contains 66% of the variance. The remaining components correspond to noise,
including multipath contamination. Both PCAs indicate that the first component can be
used as a kernel in a matched-filter detection scheme. The first components of each
sub-set’s PCA were extracted to be used as kernels, shown in Figure 4. The first sub-set’s
kernel is noticeably shorter in duration than that of the second sub-set. This is because it
was created from a click ensemble with a narrower bandwidth, 15-40 kHz rather than
15-60 kHz; therefore, some of the higher frequency click energy was excluded. The
variance captured by the first sub-set’s kernel is also higher: 73%, compared with 66% for
the second sub-set’s kernel. This is because the first sub-set was created from a narrower
bandwidth; therefore, there was less in-band noise.

Figure 4. Kernels developed for use in a matched-filter detection scheme: a) Kernel
created from the PCA of the click ensemble that was band-pass filtered between
15-40 kHz, which can be used as a kernel with SCORE hydrophone data. b) Kernel
created from the PCA of the click ensemble that was band-pass filtered between
15-60 kHz, which can be used as a kernel with HARP data.

This thesis does not utilize or assess the kernel created from the second sub-set of
data that was band-pass filtered between 15-60 kHz. Follow-on research would be
valuable in assessing the performance of this kernel in a matched-filter detection scheme
for comparison to the performance of the first sub-set’s kernel. From this point forward,
all references to a kernel are with respect to the first sub-set of data that was band-pass
filtered between 15-40 kHz.

One final analysis was performed to further investigate the robustness of the
Ziphius click and evaluate the performance of the kernel on the click ensemble: a
cross-correlation of the kernel to the entire click ensemble. This cross-correlation is shown in
Figure 5. The majority of the clicks within the ensemble are highly correlated to the
kernel. The cross-correlation of the kernel to click 51 produces a high correlation value
with only one dominant arrival and represents minimal multipath effects. However, a few
of the clicks have a lower cross-correlation coefficient. For
example, click 98 is not as highly correlated to the kernel. This particular
cross-correlation example clearly shows multiple peaks which can be attributed to multipath
effects. The normalization of the signal in the presence of multipath arrivals is
responsible for decreasing the correlation coefficient. Without normalization, the peak
correlation value for this particular example would be consistent with the higher values of
the other cases. Overall, the results of the kernel to click ensemble cross-correlation are
further evidence that the Ziphius click is a robust signal and indicate that a kernel can
feasibly be used in a matched-filter detection scheme.

A cross-correlation of the kernel to the noise segment ensemble was also
performed. All of the noise segments were poorly correlated to the kernel. This is an
expected result and verifies that the high correlation coefficients of the kernel to click
ensemble are not a coincidence.


Figure 5. Cross-correlation of the kernel to the click ensemble: a) A majority of the
correlation coefficients indicate that the kernel is highly correlated to the clicks of
the ensemble. b) Click 51 is highly correlated to the kernel. c) Click 98 is poorly
correlated to the kernel.

B. AUTOMATED MATCHED FILTER DETECTOR SCHEME

The statistically dominant first component produced via the PCA can be used as a
kernel in a matched filter detection scheme. The kernel was cross-correlated with
acoustic data obtained from NPS’s Data Acquisition System (DAS) recordings at
SCORE, using a matched filter detector designed by Chris Miller (personal
communication) of NPS’s Ocean Acoustic Laboratory (OAL). The SCORE data that was
fed into this detector came from a hydrophone at a depth of 1,497 meters, located at
32° 50.62' N, 119° 5.26' W in the San Nicolas Basin.

The automated passive acoustic matched-filter detection schematic is portrayed in
Figure 6. The first step in Miller’s detector was to cross-correlate the kernel with the
SCORE data. Then, the output of this first box was peak picked above a given threshold.
The final box of Miller’s detector utilized a rank-ordered culling system with a culling
window of +/- 390 µs. The culling window size of +/- 390 µs was selected because it is
exactly twice the length of the kernel. This step removes a majority of the multipath
effects as well as the side lobes that were introduced by the correlation and the sinusoidal
nature of the detector kernel. Removing the side lobes by culling the data cleans up the
output and significantly reduces the number of false alarms.
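
The three stages of the detector (cross-correlation, peak picking above a threshold, and rank-ordered culling) can be sketched as follows. This Python sketch is not Miller’s implementation: the function name, the local-maximum rule, and the interpretation of rank-ordered culling as "visit peaks from strongest to weakest and discard any peak inside the window of an already accepted peak" are assumptions made for illustration.

```python
def matched_filter_detect(data, kernel, threshold, cull_halfwidth):
    """Return sample indices of detections, strongest first.

    data, kernel   : sequences of floats sampled at the same rate
    threshold      : minimum correlation value for a peak to be considered
    cull_halfwidth : half-width of the culling window, in samples
    """
    m = len(kernel)
    # Stage 1: cross-correlate the kernel with the data (fully overlapping lags only).
    corr = [sum(kernel[j] * data[i + j] for j in range(m))
            for i in range(len(data) - m + 1)]
    # Stage 2: pick local maxima that reach the threshold.
    last = len(corr) - 1
    peaks = [i for i in range(len(corr))
             if corr[i] >= threshold
             and (i == 0 or corr[i] > corr[i - 1])
             and (i == last or corr[i] >= corr[i + 1])]
    # Stage 3 (assumed culling rule): accept peaks in order of decreasing
    # amplitude, dropping any peak within the window of an accepted one.
    accepted = []
    for i in sorted(peaks, key=lambda k: corr[k], reverse=True):
        if all(abs(i - kept) > cull_halfwidth for kept in accepted):
            accepted.append(i)
    return accepted

# Toy example: an oscillating kernel embedded twice in silence (not real data).
kernel = [1.0, -1.0, 1.0, -1.0]
data = [0.0] * 60
for offset, amplitude in ((10, 1.0), (40, 2.0)):
    for j, k in enumerate(kernel):
        data[offset + j] += amplitude * k

detections = matched_filter_detect(data, kernel, threshold=1.5, cull_halfwidth=4)

# A +/- 390 microsecond window expressed in samples at an 80 kHz sampling rate.
cull_samples_80khz = round(390e-6 * 80_000)
```

In the toy data the oscillating kernel produces side lobes two samples on either side of each true arrival that also exceed the threshold; the culling stage discards them and leaves one detection per embedded click, at samples 40 and 10.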


Figure 6. Automated passive acoustic matched filter detection schematic: The design
and development of the detector are courtesy of Miller (personal communication).

C. GROUND TRUTH CREATION

1. Selection Criteria

To evaluate the performance of the detector, the detector output must be
compared to an assumed ground truth. The ground truth was created by visually
inspecting SCORE data and annotating each instance of an observed Ziphius click. A total
of 174.8 minutes of acoustic data recorded on 23FEB08 by a SCORE hydrophone were
reviewed in the ground truth creation process. The duration of a Ziphius signal is less than
400 µs; therefore, the SCORE data were divided into 12,800 smaller segments for visual
review, each with a length of 0.82 s. The final log of presumably positive
Ziphius vocalizations was then used to statistically analyze the automated detector’s
performance via probabilistic means comparing hits, false alarms, and misses at varying
threshold levels.

The ground truth creation proved to be the most arduous and time-consuming
aspect of this research. Even at a decreased time scale, the certainty of the ground truth
remained dependent upon discernment. An initial ground truth was deliberately
discarded because, as the ground truth creation process progressed, the experience level
and the signal discrimination improved, and unacceptable inconsistencies became
apparent.

The successive ground truth creation process incorporated specific criteria to
alleviate subjectivity. The first criterion mandated that a click selected for inclusion in
the ground truth must have continuous energy between 22.5 and 35 kHz. This standard
was adopted under the notion of continuous eye integration, meaning the eye has the
ability to visually connect minuscule gaps within the click energy of the spectrogram. If
the first condition was not met, the second criterion directed ground truth inclusion if a
click was part of a distinctive click train, consisting of regular, repeated clicks with a
constant ICI. Figure 7 exemplifies an instance in which the energy criterion was not met;
however, a distinctive click train was present. Thus, the clicks not spanning 22.5 to 35
kHz were still included in the ground truth, having met the second criterion. The energy
criterion is visibly enhanced with the overlaying of two solid black lines on the
spectrogram. The dashed blue lines on the spectrogram and the blue stars on the time
series indicate identified clicks. The final criterion established that only one click would
be selected in the case of a cluster. A cluster consisted of multiple clicks that were
visually indistinguishable from one another at the prescribed time scale. These subjective
criteria allowed for the creation of a more objective ground truth. In total, 28,434 clicks
were identified in the 174.8 minutes of data that were reviewed.
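
The segment length follows directly from the review duration, and the comparison of hits, misses, and false alarms can be illustrated with a small scoring routine. The Python sketch below is a hedged illustration only: the matching tolerance, the one-to-one matching rule, and the function name `score_detections` are assumptions, and how the false alarm rate is normalized in this research is not specified here, so no rate is computed.

```python
# 174.8 minutes of data split into 12,800 review segments.
segment_length_s = 174.8 * 60 / 12_800     # about 0.82 s per segment

def score_detections(truth, detections, tolerance):
    """Match detections to ground-truth click times (all in the same time units).

    Each ground-truth click can be matched by at most one detection.
    Returns (hits, misses, false_alarms)."""
    unmatched = sorted(truth)
    hits = 0
    false_alarms = 0
    for d in sorted(detections):
        match = next((t for t in unmatched if abs(t - d) <= tolerance), None)
        if match is None:
            false_alarms += 1          # detection with no nearby annotated click
        else:
            unmatched.remove(match)    # this annotated click is now accounted for
            hits += 1
    return hits, len(unmatched), false_alarms

# Toy example with times in seconds and a 400 microsecond tolerance (assumed).
truth = [1.00, 1.40, 1.80, 2.20]
detections = [1.0001, 1.8002, 3.0]
hits, misses, false_alarms = score_detections(truth, detections, tolerance=0.0004)
probability_of_detection = hits / (hits + misses)
```

Repeating this scoring at a range of detector thresholds yields the pairs of detection and false alarm counts from which ROC curves can be built.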

Figure 7. Ground truth creation example: The upper panel is a spectrogram, and within
it the energy criterion is exemplified by the solid black lines spanning 22.5-35
kHz. The lower panel is the corresponding time series of the SCORE data. The
click energy does not span the entire width of the energy criterion; however, there
is a distinctive click train. The dashed blue lines on the upper panel and
corresponding blue stars on the lower panel represent the identification of a click
utilizing the second criterion, which directs the selection of a click if it is part of
a distinct click train. This is also an example of a time period where the Ziphius
click could be distinguished among competing signals. Time periods such
as this were designated as “clutter” for the subsequent statistical analysis.

2. Statistical Analysis Exclusions

The eventual detector output was biased with respect to the ground truth creation.
The acoustic signatures of other marine mammals occur within approximately the same
frequency range as that of a Cuvier’s beaked whale. Figure 8 (National Resources
Council, 2003) depicts these overlapping frequency ranges of vocalizations. Visual
surveys conducted from July 2006 to April 2007 identified several of these species of
marine mammals with overlapping vocalization frequencies at SCORE. In addition
to Cuvier’s beaked whales, Risso’s dolphins, Pacific white-sided dolphins, sperm
whales, orcas, Baird’s beaked whales, false killer whales, and humpback whales have
all been found at SCORE (Hildebrand, 2007).

Figure 8. Representative vocalizations of marine mammals (National Resources
Council, 2003): Tonal vocalizations are plotted in red; impulsive vocalizations
are plotted in blue. The thicker lines represent frequencies near maximum energy
and the thinner lines indicate the total range of frequencies. The numbers above
the line indicate measured source levels in dB re µPa at 1 m.
The overlapping frequency ranges of other marine mammals’ vocalizations and Cuvier’s
beaked whale vocalizations create uncertainty within the ground truth. To diminish this
uncertainty, time periods containing such indistinguishable signals were purposefully
excluded from the subsequent statistical analysis. Similarly, time periods in which the
data recordings were interrupted or turned off were also eliminated. Figure 9 is an
example of a time period that was deliberately removed from the ground truth due to its
indistinguishable clutter. A total of 28.89 minutes was excluded from the statistical
analysis.


Cluttered Exclusion Spectrogram for Record 133: Window 187: Start Time: 02-23-2008 09:46:59.189

Figure 9. Ground truth exclusion due to indistinguishable clutter: The upper panel is a
spectrogram  and  the  lower  panel  is  the  corresponding  time  series  for  the  data  that 
was  reviewed  to  create  the  ground  truth.  This  is  an  example  where  the  signal  was 
indistinguishable  due  to  the  presence  of  other  marine  mammals’  vocalizations. 
Time  periods  such  as  these  were  excluded  from  the  statistical  analysis  because  the 
Ziphius  signal  could  not  be  visually  distinguished  from  amongst  the  clutter. 
3. Clutter Categories


To further remove uncertainty from the remaining ground truth, a sub-category
consisting of un-excluded clutter was created. This category consists of time periods
in which significant clutter was present; it differs from the previously discussed
excluded clutter because the Ziphius signal remained discernible. By
distinguishing the cluttered time periods from the uncluttered time periods, two distinct
sets of statistics could be generated for the detector performance analysis. Figure 7
is an example in which competing signals were present, yet the Ziphius signal could still
be distinguished among the clutter. A total of 20.83 minutes was designated as
un-excluded clutter.

4.  Interclick  Interval 

During the creation of the ground truth, an unexpected observation was made with
respect to the ICI. Previous research has cited an ICI of approximately 0.4 s for Cuvier’s
beaked whales (Johnson et al., 2004; Zimmer et al., 2005; Pavan et al., 2006; Tyack et al.,
2006). The data inspected to create the ground truth consistently displayed Ziphius
vocalizations with a discernible ICI of approximately 0.05 s. A possible explanation for
this striking difference is that different animals were vocalizing intermittently.
It is also possible that these are not Cuvier’s beaked whales. However, the ICI of 0.05 s
to 0.1 s appears to be regular and recurs throughout the dataset.
Figure 10 depicts one of these time periods within the ground truth with a distinctive and
regular ICI. This particular example has an ICI of 0.05 s, which was a common
observation. The difference of nearly an order of magnitude between the referenced
literature and these observations was unexpected. Further exploration of the ICI dynamics
is tangential and beyond the scope of this research.
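As an illustration of how an ICI can be read from a set of click identifications, the following is a minimal sketch, assuming the click times are available as a list of seconds. The function name and the synthetic click train are hypothetical and are not taken from the ground truth data.

```python
import numpy as np

def interclick_intervals(click_times_s):
    """Return the intervals (in seconds) between consecutive click times."""
    times = np.sort(np.asarray(click_times_s, dtype=float))
    return np.diff(times)

# Hypothetical click train: 11 clicks spaced 0.05 s apart (not measured data)
click_times = 0.05 * np.arange(11)
ici = interclick_intervals(click_times)
median_ici = float(np.median(ici))
print(f"median ICI = {median_ici:.3f} s")
```

The median is used here rather than the mean so that a single skipped click, which doubles one interval, does not distort the estimate.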
Ground Truth Creation Spectrogram for Record 129: Window 509: Start Time: 02-23-2008 09:23:25.067
Corresponding Time Series of the Ground Truth Creation

Figure 10. Ground truth example with a distinct 0.05 s ICI: The upper panel is a
spectrogram and the lower panel is the corresponding time series of the data that
was reviewed to create the ground truth. The solid black lines on the spectrogram
at 22.5 and 35 kHz are representative of the ground truth’s energy criterion. The
dashed blue lines on the spectrogram and blue stars on the time series are
representative of click identifications. This figure depicts an ICI of 0.05 s.
III. DETECTOR PERFORMANCE RESULTS


The  performance  of  the  automated  passive  acoustic  matched-filter  detector  was 
assessed  by  statistically  comparing  the  detector’s  output  to  the  ground  truth.  A  detector 
output  hit  that  corresponded  to  a  ground  truth  click  identification  was  a  correct  hit.  A 
detector  output  hit  that  did  not  correspond  to  a  ground  truth  click  identification  was  a 
false  alarm.  A  ground  truth  click  identification  that  did  not  have  an  associated  detector 
output  hit  was  a  miss.  A  correct  rejection  occurred  when  there  were  no  detector  output 
hits  and  no  ground  truth  click  identifications.  Probabilities  of  detection  (P(D))  and 
probabilities  of  false  alarms  (P(FA))  were  calculated  by  the  following  equations: 


$$P(D) = \frac{H}{H + M} \qquad (2)$$

$$P(FA) = \frac{FA}{FA + CR} \qquad (3)$$

where H is the number of correct hits, FA is the number of false alarms, M is the number
of misses, and CR is the number of correct rejections.
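Equations (2) and (3) can be written directly as code. The following is a minimal sketch. The function name and the example counts are hypothetical and were chosen only to make the arithmetic easy to follow; they are not results from this research.

```python
def detection_rates(hits, misses, false_alarms, correct_rejections):
    """Return (P(D), P(FA)) as fractions, following equations (2) and (3)."""
    p_d = hits / (hits + misses)
    p_fa = false_alarms / (false_alarms + correct_rejections)
    return p_d, p_fa

# Hypothetical counts for illustration only
p_d, p_fa = detection_rates(hits=44, misses=56,
                            false_alarms=1, correct_rejections=999)
print(f"P(D) = {p_d:.1%}, P(FA) = {p_fa:.1%}")
```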

By calculating the P(D) and P(FA) at varying threshold levels, Receiver
Operating Characteristic (ROC) curves were created. The ROC curves are shown in
Figure 11. Table 1 displays the detector performance results at the varying thresholds that
were used to create the ROC curves. At an acceptable P(FA) of 0.1%, the automated
passive acoustic matched-filter detector had an overall P(D) of 44%. The P(D) increased
as the threshold was lowered; however, lowering the threshold also increased the
P(FA). The tradeoff between P(D) and P(FA) is an important factor to consider when
utilizing the detector. The category of data being processed by the detector also affected
the P(D) and P(FA). As described in the previous chapter, the data were separated
into two distinct categories for further detector assessment. At an acceptable P(FA) of
0.1%, the detector had a P(D) of 61% and 26% in uncluttered and cluttered data,
respectively. At every analyzed threshold, the detector had a lower P(FA) when processing
the uncluttered data than when processing the cluttered data.
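The threshold sweep behind a ROC curve can be sketched as follows. This is an illustration under stated assumptions, not the scoring code used in this research: it assumes the record has already been divided into evaluation windows, each with a peak detector output and a ground truth label, which is one possible way to count correct rejections. The scores and labels below are synthetic.

```python
import numpy as np

def roc_points(scores, is_click, thresholds):
    """Return a list of (threshold, P(D), P(FA)) for each threshold."""
    scores = np.asarray(scores, dtype=float)
    is_click = np.asarray(is_click, dtype=bool)
    points = []
    for t in thresholds:
        detected = scores >= t
        h = np.sum(detected & is_click)      # correct hits
        m = np.sum(~detected & is_click)     # misses
        fa = np.sum(detected & ~is_click)    # false alarms
        cr = np.sum(~detected & ~is_click)   # correct rejections
        points.append((t, h / (h + m), fa / (fa + cr)))
    return points

# Synthetic peak detector outputs and ground truth labels (illustration only)
scores = [0.006, 0.004, 0.0025, 0.0012, 0.0008, 0.0003, 0.0002, 0.0001]
is_click = [True, True, True, False, True, False, False, False]
points = roc_points(scores, is_click, thresholds=[5e-3, 1e-3, 5e-4])
for t, p_d, p_fa in points:
    print(f"threshold {t:.0e}: P(D) = {p_d:.2f}, P(FA) = {p_fa:.2f}")
```

Lowering the threshold in this toy example raises P(D) from 0.25 to 1.00 while P(FA) rises from 0.00 to 0.25, the same tradeoff described above.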


ROC curves (axis label: P(FA))

Figure 11. ROC curves to assess the detector’s performance: The orange curve is the
overall  performance  of  the  detector,  combining  both  the  uncluttered  and  cluttered 
time  periods.  The  detector  performed  best  during  the  uncluttered  time  periods, 
shown  by  the  green  line.  The  detector  performance  was  degraded  during  the 
cluttered  time  periods,  shown  by  the  blue  line. 


DETECTOR PERFORMANCE RESULTS

             UNCLUTTERED            CLUTTERED              COMBINED
THRESHOLD    P(D)       P(FA)       P(D)       P(FA)       P(D)       P(FA)
5.00E-04     86.0815%   0.6355%     92.1336%   8.8121%     89.6917%   1.8614%
1.00E-03     79.1634%   0.3016%     90.3412%   5.7345%     85.8094%   1.1161%
1.25E-03     66.7453%   0.1314%     86.7028%   3.2769%     78.6127%   0.6030%
1.50E-03     55.2777%   0.0703%     80.7665%   2.0337%     70.4451%   0.3646%
1.75E-03     46.2646%   0.0416%     74.0090%   1.3750%     62.7993%   0.2415%
2.00E-03     39.0050%   0.0259%     67.9076%   0.9926%     56.2827%   0.1709%
3.00E-03     20.1885%   0.0059%     49.4160%   0.3792%     37.7543%   0.0619%
5.00E-03      7.4456%   0.0010%     26.3202%   0.0996%     19.0952%   0.0158%

Table 1. Automated passive acoustic matched-filter detector performance results for the
uncluttered, cluttered, and combined time periods.
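The P(D) values quoted at a P(FA) of 0.1% fall between the tabulated thresholds. As a sketch of how such values can be read from Table 1, the following linearly interpolates each ROC curve at P(FA) = 0.1%. Linear interpolation is an assumption made here for illustration; the method actually used to read the values from the ROC curves is not stated.

```python
import numpy as np

# Table 1 values in percent, listed in order of increasing threshold
TABLE_1 = {
    "uncluttered": {
        "pd": [86.0815, 79.1634, 66.7453, 55.2777, 46.2646, 39.0050, 20.1885, 7.4456],
        "pfa": [0.6355, 0.3016, 0.1314, 0.0703, 0.0416, 0.0259, 0.0059, 0.0010],
    },
    "cluttered": {
        "pd": [92.1336, 90.3412, 86.7028, 80.7665, 74.0090, 67.9076, 49.4160, 26.3202],
        "pfa": [8.8121, 5.7345, 3.2769, 2.0337, 1.3750, 0.9926, 0.3792, 0.0996],
    },
    "combined": {
        "pd": [89.6917, 85.8094, 78.6127, 70.4451, 62.7993, 56.2827, 37.7543, 19.0952],
        "pfa": [1.8614, 1.1161, 0.6030, 0.3646, 0.2415, 0.1709, 0.0619, 0.0158],
    },
}

def pd_at_pfa(pd, pfa, target_pfa):
    """Linearly interpolate P(D) at a target P(FA); all values in percent."""
    pd = np.asarray(pd, dtype=float)
    pfa = np.asarray(pfa, dtype=float)
    order = np.argsort(pfa)  # np.interp requires increasing x values
    return float(np.interp(target_pfa, pfa[order], pd[order]))

results = {name: pd_at_pfa(v["pd"], v["pfa"], 0.1) for name, v in TABLE_1.items()}
for name, value in results.items():
    print(f"{name}: P(D) = {value:.0f}% at P(FA) = 0.1%")
```

Under this assumption the interpolated values round to 61%, 26%, and 44% for the uncluttered, cluttered, and combined data, in agreement with the values reported in the text.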
Figure 12 is an example of a cluttered time period. The detector output statistics
are depicted for two different threshold levels, each marked with a horizontal
orange line in the middle and bottom panels. The dashed blue lines on the spectrogram
indicate the locations of the ground truth selections. When the threshold level is set at
0.005, as in the middle panel of Figure 12, the number of false alarms is acceptably low,
even in a cluttered time period. The detector does hit on several of the ground truths;
however, at this threshold the detector misses more Ziphius clicks than it correctly
detects. When the threshold is lowered by an order of magnitude, as in the bottom panel,
the detector hits each of the ground truths with zero misses. The
tradeoff is a significant increase in false alarms because the threshold level is now
located within the clutter. These low values of detector output may contain Ziphius
clicks; however, the statistics declare them false alarms when compared to the ground
truth. The ground truth is a conservative estimate because of the inability to visually
detect a Ziphius click when it is embedded in the noise. The statistical output for
the cluttered data would contain fewer false alarms if it were compared to a perfect
ground truth, and the cluttered ROC curve would be shifted significantly to the left.
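The direction of this shift can be checked against equations (2) and (3). As a sketch, assume that k of the FA false alarms are in fact Ziphius clicks that were missed during visual inspection, and that M and CR are otherwise unchanged. Here k is a hypothetical quantity that was not measured in this research, and 0 ≤ k ≤ FA. Scoring against a perfect ground truth would then give:

```latex
P(FA)' = \frac{FA - k}{(FA - k) + CR} \le \frac{FA}{FA + CR} = P(FA),
\qquad
P(D)' = \frac{H + k}{(H + k) + M} \ge \frac{H}{H + M} = P(D)
```

Under these assumptions, each point on the ROC curve moves to the left and does not move down. The shift is larger where k is larger, which is expected in the cluttered time periods.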
Corresponding Detector Output: 5E-3 Threshold
Corresponding Detector Output: 5E-4 Threshold

Figure 12. Detector output statistics for a cluttered time period: The upper panel is the
spectrogram  with  ground  truth  click  identifications  marked  by  the  dashed  blue 
line.  The  middle  panel  is  the  corresponding  detector  output  for  a  given  threshold 
and  the  bottom  panel  is  the  corresponding  detector  output  for  a  lowered  threshold. 
The  threshold  level  is  denoted  by  the  solid  orange  line  on  the  middle  and  bottom 
panels.  In  the  middle  panel,  the  detector  misses  several  of  the  ground  truth  click 
identifications;  however,  the  false  alarm  rate  is  very  low.  The  detector  is  able  to 
hit  all  of  the  ground  truth  click  identifications  with  no  misses  when  the  threshold 
is  at  the  lowest  level;  however,  there  is  a  significant  increase  in  false  alarms. 
In comparison to the cluttered time periods, the detector output statistics indicated
much lower false alarm rates when the detector was processing uncluttered data. Figure
13 is an uncluttered example in which the false alarm rate remains low even at a threshold
level of 0.0005. The ground truth for the uncluttered time periods is also conservative in
comparison to what a perfect ground truth would indicate. As in the cluttered data, this
inherent flaw increases the false alarm rate. The availability of a
perfect ground truth would lower the P(FA) and shift the uncluttered ROC curve
to the left; however, the shift would not be as large as the one expected for the cluttered
data ROC curve.

The unavoidable ground truth bias does not alone account for the detector’s
performance failures. Even in uncluttered data, the detector displayed limitations
in the presence of a vocalizing Cuvier’s beaked whale. Figure 13 exemplifies the
detector’s failure to hit a ground truth even at the lowest analyzed threshold level. In this
instance, the detector correctly hits 10 of 11 ground truths within a distinct click pattern.
Unexpectedly, the detector fails to hit one of the seemingly stronger clicks
within the click train. This miss could potentially be a consequence of
multipath or environmental effects. The cross-correlation of the kernel with the SCORE
click ensemble, shown previously in Figure 5, confirms that the
correlation value decreases when multipath effects are present. However, the miss can
most likely be attributed to the low signal-to-noise ratio (SNR).
Figure 13. Uncluttered data example portraying the detector’s limitations with a low
SNR: The upper panel is the spectrogram of an uncluttered time period, with the
ground truth click identifications emphasized with the dashed blue line. The
lower panel is the corresponding detector output for a threshold of 5E-4, which is
depicted with the solid orange line.


Another shortcoming of the detector is its performance when other marine
mammals are vocalizing within the same time period as Cuvier’s beaked whales. Figure
14 is a designated cluttered time period in which there appears to be delphinid activity as
well as Ziphius clicks. The detector hit each of the ground truths at a
threshold of 0.001, shown in the bottom panel; however, it also hit on the apparent
delphinid clicks. The correlation values for the non-Ziphius vocalizations vary throughout
the time period, which makes it difficult to select a threshold that will still detect the
Cuvier’s clicks while correctly rejecting the undesired vocalizations. Increasing the
threshold, shown in the middle panel of Figure 14, improves the detector’s performance
by dramatically lowering the number of false alarms. However, at this higher
threshold level, there are several missed detections.


Spectrogram for Record 133: Start Time: 02-23-2008 09:44:35.490


Figure  14.  Detector  performance  in  the  presence  of  delphinid  activity:  The  top  panel  is  a 
spectrogram  with  the  ground  truth  identifications  marked  with  dashed  blue  lines. 
The  middle  and  lower  panels  display  the  corresponding  detector  output  at  a  given 
threshold,  marked  with  the  orange  line.  The  P(D)  is  better  at  the  lower  threshold; 
however,  the  P(FA)  increases  as  well. 
The detector performance was degraded when other marine mammals vocalized
within the same time period as a Cuvier’s beaked whale. In spite of this, the detector
performed well in the presence of multiple Ziphius. Figure 15 portrays an ICI that is
approximately one half the routinely observed 0.05 s ICI. The shortened ICI and
alternating click magnitudes on the spectrogram suggest that two Cuvier’s beaked whales
are vocalizing intermittently. This example also depicts the conservative bias inherent to
the ground truth. The false alarms in the initial portion of this window most likely
contain Ziphius clicks that were visually indiscernible.


Corresponding Detector Output: 1E-3 Threshold

Figure 15. Detector performance in the presence of two Cuvier’s beaked whales: The
upper panel is the spectrogram and the ground truth click identifications are
emphasized with the dashed blue lines. The lower panel is the corresponding
detector output. The detector performs well in the presence of multiple Ziphius.
The detector performance was also analyzed during time periods where no
Ziphius  activity  was  observed.  The  detector  performed  perfectly  in  these  instances  where 
the  ground  truth  contained  zero  clicks.  The  respective  detector  output  statistics  indicated 
zero  hits,  zero  misses,  and  zero  false  alarms  at  all  analyzed  thresholds  during  these 
known  quiet  periods. 
IV. CONCLUSIONS


The unique spectral and temporal structures of Cuvier’s beaked whales’
vocalizations are favorable for automated detection via a matched filter. A kernel was
generated for two different types of acoustic recording devices: a HARP and a SCORE
hydrophone. The kernel, which was generated from data band-pass filtered between 15 and
40 kHz, had a 390 μs duration. This is greater than the click durations cited in
recent research: 175 μs (Johnson et al., 2004), 200 μs (Zimmer et al., 2005), 250 μs
(Johnson et al., 2004 and Tyack et al., 2006), and 300 μs (Pavan et al., 2006). This
difference can likely be attributed to the available bandwidth of the acoustic recording
instrument or the nature of the comparison. An acoustic recording instrument with a
narrower bandwidth would capture a shorter duration of the click than an instrument with
a wider bandwidth. Also, this is not a direct click-to-click comparison. The kernel is a
compilation of 150 different clicks that were statistically analyzed to extract one
dominant component, which accounted for 73% of the variance.
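The extraction of a dominant component from a click ensemble can be sketched with a singular value decomposition. This is an illustration only and is not the processing used in this research. The sample rate, pulse shape, amplitude spread, and noise level below are assumptions chosen to produce a synthetic ensemble of 150 clicks, and removing the ensemble mean before the decomposition is the conventional PCA step rather than a documented detail of the thesis.

```python
import numpy as np

def dominant_mode(ensemble):
    """Return the first principal component of an (n_clicks, n_samples)
    ensemble and the fraction of the total variance it accounts for."""
    x = np.asarray(ensemble, dtype=float)
    x = x - x.mean(axis=0)                    # remove the ensemble mean
    _, s, vt = np.linalg.svd(x, full_matrices=False)
    variance = s ** 2
    return vt[0], float(variance[0] / variance.sum())

# Synthetic ensemble (all parameters hypothetical)
rng = np.random.default_rng(0)
fs = 200e3                                    # assumed sample rate, Hz
t = np.arange(78) / fs                        # 78 samples = 390 microseconds
template = np.exp(-((t - t.mean()) / 60e-6) ** 2) * np.cos(2 * np.pi * 30e3 * t)
amplitudes = rng.uniform(0.5, 1.5, size=(150, 1))
ensemble = amplitudes * template + 0.05 * rng.standard_normal((150, t.size))

mode, fraction = dominant_mode(ensemble)
similarity = abs(mode @ template) / np.linalg.norm(template)
print(f"variance fraction of first mode: {fraction:.2f}")
print(f"similarity of first mode to the template: {similarity:.3f}")
```

In this synthetic case the first mode recovers the underlying pulse shape up to sign, which is the property that makes a dominant mode usable as a matched-filter kernel.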

The consistently observed ICI in this study was approximately 0.05 s. This
observation disagrees with other recently published research: 0.38 s (Pavan et
al., 2006), 0.40 s (Johnson et al., 2004 and 2006, and Tyack et al., 2006), and 0.43 s
(Zimmer et al., 2005). The ICI was not the focus of this project. It was, however, a
consistently observed phenomenon during the ensemble and ground truth creation. The
difference of nearly an order of magnitude is a significant result. One possible
explanation is that multiple animals were vocalizing intermittently. However, the highly
regular and repetitive intervals are suggestive of a single animal. It is also possible
that these vocalizations are made by a species other than Ziphius cavirostris or that this
species simply vocalizes at varying ICIs. Further exploration of this unexpected disparity
was beyond the scope of this research.

A total of 174.8 minutes of data from NPS’s DAS recordings at a SCORE
hydrophone were reviewed. Specific criteria were adhered to in an attempt to limit
subjectivity. The selection criteria included spectrogram energy between 22.5
and 35 kHz and/or a distinctive click train pattern, and a single selection of a cluster.
Following these criteria, 28,434 clicks were selected for inclusion in the ground truth.
Time segments when the data recordings were interrupted or when the signal could not be
confidently discerned due to indistinguishable clutter were removed; a total of 28.89
minutes was purposefully excluded from the statistical analysis. The remaining ground
truth was then separated into categories of cluttered and uncluttered data to further
distinguish the ROC curves.

Despite all attempts to produce a precise ground truth, it was an inherently
conservative estimate. At times, the detector was able to detect signals that could not be
visually discerned. The cluttered time periods were affected by this bias more than
the uncluttered time periods. During the cluttered time periods, actual signals became
hidden in the noise, causing misses in the ground truth. These ground truth
misses became detector false alarms in the detector evaluation. If this bias could be
removed, the measured detector performance would improve: significantly in the
cluttered time periods and slightly in the uncluttered time periods.

At an acceptable false alarm rate of 0.1%, the overall detector P(D) is 44%. The
detector performed best in uncluttered time periods, with a 61% P(D) at a 0.1% false
alarm rate. The detector’s performance degrades in cluttered data, where it has a P(D)
of 26% at a 0.1% false alarm rate. The detector performance is perfect in the absence of
clicks. The detector does not distinguish well between non-ziphiid vocalizations and
Cuvier’s beaked whales’ vocalizations.

The greatest problem for the detector is the significant number of false alarms
from marine mammals other than the desired species. The detector definitely detects clicks.
However, it cannot be determined with certainty which species’ clicks are being detected,
due to the inability to visually discern the differences at the time scale used. The kernel
that was developed from the second sub-set of the ensemble, which was band-pass filtered
from 15-60 kHz, was not utilized or assessed in this research. Assessing this second
kernel with HARP data recordings could provide further insight into the performance of the
detector.

Potential  follow-on  research  that  could  build  upon  the  premises  established  in  this 
thesis  includes: 

*  An  in-depth  investigation  of  the  ICI  disparities  between  this  thesis  and  other 
research 

*  Increasing  the  ensemble  sample  size,  performing  a  PCA,  and  then  comparing  the 
resultant  kernel  to  the  kernel  used  in  this  research 

*  With  the  availability  of  an  enhanced  kernel,  repeating  the  detector  performance 
analysis 

*  Applying  the  detector  to  other  SCORE  hydrophones  within  the  NPS  DAS 

*  Duplicating  this  research  with  the  unevaluated  kernel  and  assessing  the  detector’s 
performance  when  processing  NPS  and/or  SIO  HARP  data 

*  Comparing  temporally  coincident  detector  results  from  a  SCORE  hydrophone 
with  results  from  the  nearby  SIO  site  H  HARP 

*  Assessing  the  classification  performance  of  the  kernel  to  correctly  identify  a 
Ziphius  click  from  other  marine  mammals’  vocalizations 

*  A  study  utilizing  the  optimum  detector  to  assess  the  geographic  call  density 
distribution 

*  A study utilizing the optimum detector to assess the seasonal and/or diurnal
variability of call patterns




LIST  OF  REFERENCES 


Barlow, J., 1999. Trackline detection probability for long-diving whales. 209-221. In:
G.W. Garner, S.C. Amstrup, J.L. Laake, B.R.J. Manly, L.L. McDonald and D.G.
Robertson (eds.) Marine Mammal Survey and Assessment Methods. Balkema
Press, Rotterdam, Netherlands.

Barlow, J., M.C. Ferguson, W.F. Perrin, L. Ballance, T. Gerrodette, G. Joyce, C.D.
Macleod, K. Mullin, D.L. Palka and G. Waring. 2006. Abundance and densities of
beaked whales and bottlenose whales (family Ziphiidae). J. Cetacean Res.
Manage., 7(3), 263-270.

Cox, T. M., T. J. Ragen, A. J. Read, E. Vos, R. W. Baird, K. Balcomb, J. Barlow, J.
Caldwell, T. Cranford, L. Crum, A. D’Amico, G. D’Spain, A. Fernández, J.
Finneran, R. Gentry, W. Gerth, F. Gulland, J. Hildebrand, D. Houser, T. Hullar, P.
D. Jepson, D. Ketten, C. D. Macleod, P. Miller, S. Moore, D. C. Mountain, D.
Palka, P. Ponganis, S. Rommel, T. Rowles, B. Taylor, P. Tyack, D. Wartzok, R.
Gisiner, J. Mead and L. Benner. 2006. Understanding the impacts of
anthropogenic sound on beaked whales. J. Cetacean Res. Manage., 7(3), 177-187.

Federal Register, Decision Memorandum Accepting Alternative Arrangements for the
U.S. Navy’s Southern California Operating Area Composite Training Unit
Exercises (COMPTUEXs) and Joint Task Force Exercises (JTFEXs) Scheduled to
Occur Between Today and January 2009. Vol. 73: No. 16, January 24, 2008.

Ferguson, M.C., J. Barlow, S.B. Reilly, and T. Gerrodette. 2006. Predicting Cuvier’s
(Ziphius cavirostris) and Mesoplodon beaked whale population density from
habitat characteristics in the eastern tropical Pacific Ocean. J. Cetacean Res.
Manage., 7(3), 287-299.

Hastings, Mardi C., 2008. Coming to Terms With the Effects of Ocean Noise on Marine
Mammals. Acoustics Today, 4(2), 21-33.

Hildebrand, J.A. 2005. Impacts of Anthropogenic Sound. Marine Mammal Research:
Conservation Beyond Crisis. J.E. Reynolds III, W.F. Perrin, R.R. Reeves, S.
Montgomery, and T.J. Ragen. Johns Hopkins University Press,
Baltimore, MD. 101-124.

Hildebrand, J.A., Report to CNO (N45): Marine Mammal Acoustic Monitoring and
Habitat Investigation, Southern California Offshore Region. November 2007.

Johnson, M., P.T. Madsen, N. Aguilar Soto, W.M.X. Zimmer, and P.L. Tyack. 2004.
Beaked Whales Echolocate for Prey. Proc. R. Soc. Lond. B 271, S383-S386.




Johnson, M., P.T. Madsen, W.M.X. Zimmer, N. Aguilar Soto, and P.L. Tyack. 2006.
Foraging Blainville’s beaked whales (Mesoplodon densirostris) produce distinct
click types matched to different phases of echolocation. The Journal of
Experimental Biology, 209, 5038-5050.

Miller, Christopher W., 2008. Naval Postgraduate School. Personal communication with
regards to the development of an automated passive acoustic matched-filter
detection scheme.

National Research Council, Ocean Noise and Marine Mammals: Committee on Potential
Impacts of Ambient Noise in the Ocean on Marine Mammals, Ocean Studies
Board, Division on Earth and Life Studies. Washington DC: The National
Academies Press, 2003.

Pavan, G., C. Fossati, M. Priano, M. Manghi. 2006. Report to the 58th IWC Scientific
Committee: Recording Cuvier’s beaked whales (Ziphius cavirostris) with a
wideband towed array. SC/58/E18.

Shaw, P., Multivariate Statistics for the Environmental Sciences. 2003. London: Hodder
Arnold.

Tyack, P. L., M. Johnson, N. Aguilar Soto, A. Sturlese, and P.T. Madsen. 2006. Extreme
diving of beaked whales. The Journal of Experimental Biology, 209, 4238-4253.

Wiggins, S. M. and J.A. Hildebrand. 2007. High-frequency Acoustic Recording Package
(HARP) for broad-band, long-term marine mammal monitoring. Symposium on
Underwater Technology and Workshop on Scientific Use of Submarine Cables
and Related Technologies, 2007. 17-20 April 2007, 551-557.

Wiggins, S. M., December 2007, Scripps Institution of Oceanography. Personal
communication via email with regards to TRITON software.

Zimmer, W.M.X., M.P. Johnson, June 2005. Echolocation clicks of free-ranging Cuvier’s
beaked whales (Ziphius cavirostris). J. Acoust. Soc. Am., 117(6), 3919-3927.




INITIAL  DISTRIBUTION  LIST 


1.  Defense Technical Information Center
Ft. Belvoir, VA

2.  Dudley Knox Library
Naval Postgraduate School
Monterey, CA

3.  Ching-Sang Chiu
Naval Postgraduate School
Monterey, CA

4.  Christopher W. Miller
Naval Postgraduate School
Monterey, CA

5.  John E. Joseph
Naval Postgraduate School
Monterey, CA

6.  Curtis A. Collins
Naval Postgraduate School
Monterey, CA

7.  Tetyana Margolina
Naval Postgraduate School
Monterey, CA

8.  CDR Rebecca Stone
Naval Postgraduate School
Monterey, CA

9.  Katherine Whitaker
Whitaker LTD
Pacific Grove, CA

10.  Don Brutzman
Naval Postgraduate School
Monterey, CA

11.  Samantha Poteete
Naval Postgraduate School
Monterey, CA

12.  Sean M. Wiggins
Scripps Institution of Oceanography
La Jolla, CA

13.  John A. Hildebrand
Scripps Institution of Oceanography
La Jolla, CA

14.  Erin Oleson
Scripps Institution of Oceanography
La Jolla, CA

15.  E. Elizabeth Henderson
Scripps Institution of Oceanography
La Jolla, CA

16.  Mark McDonald
Whale Acoustics
Bellvue, CO

17.  Dave Mellinger
Oregon State University
Newport, OR

18.  Heidi Nevitt
SCORE Operations Center
San Diego, CA

19.  Frank Stone
CNO N45
Washington, DC

20.  Ernie Young
CNO N45
Washington, DC

21.  Holly Burd
Palumbo Contracting
Lower Burrell, PA

22.  Joy Burd
Happy Canary Productions
Lower Burrell, PA

23.  Carl Mohamed
Pittsburgh Post-Gazette
Pittsburgh, PA

24.  Jan T. Mohamed
Confident Vision
Dallas, TX

25.  Jessica Rose Mohamed
SPAWAR Space Field Activity
Chantilly, VA

26.  Barbara Cocolin
Management Specialists
Fleetville, PA


